Recognizing Mathematical Expressions Using Tree Transformation

Authors

  • Richard Zanibbi
  • Dorothea Blostein
  • James R. Cordy
Abstract

We describe a robust and efficient system for recognizing typeset and handwritten mathematical notation. From a list of symbols with bounding boxes, the system analyzes an expression in three successive passes. The Layout Pass constructs a Baseline Structure Tree (BST) describing the two-dimensional arrangement of input symbols. Reading order and operator dominance are used to allow efficient recognition of symbol layout even when symbols deviate greatly from their ideal positions. Next, the Lexical Pass produces a Lexed BST from the initial BST by grouping tokens comprised of multiple input symbols; these include decimal numbers, function names, and symbols comprised of nonoverlapping primitives such as “=”. The Lexical Pass also labels vertical structures such as fractions and accents. The Lexed BST is translated into LaTeX. Additional processing, necessary for producing output for symbolic algebra systems, is carried out in the Expression Analysis Pass. The Lexed BST is translated into an Operator Tree, which describes the order and scope of operations in the input expression. The tree manipulations used in each pass are represented compactly using tree transformations. The compiler-like architecture of the system allows robust handling of unexpected input, increases the scalability of the system, and provides the groundwork for handling dialects of mathematical notation.
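
As a rough illustration of the pass structure described in the abstract, the sketch below assumes a much-simplified baseline tree in which each symbol may carry nested "sub" and "super" baselines. The names `Node`, `lex_baseline`, and `to_latex` are hypothetical and do not come from the paper. The Layout Pass is omitted (the tree is built by hand), and the only lexical rule shown is the grouping of digits into decimal numbers.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One symbol (or merged token) on a baseline. Illustrative only."""
    label: str
    # Maps a region name ("sub" or "super") to the nested baseline in it.
    regions: dict = field(default_factory=dict)


def lex_baseline(baseline):
    """Toy lexical pass: merge runs of digits and '.' into one number token."""
    out = []
    for node in baseline:
        # Lex nested baselines first, building a new node (no mutation).
        node = Node(node.label,
                    {r: lex_baseline(b) for r, b in node.regions.items()})
        prev = out[-1] if out else None
        is_num_part = node.label.isdecimal() or node.label == "."
        if (prev is not None and is_num_part and not prev.regions
                and prev.label.replace(".", "").isdecimal()):
            # Previous token is a number with no script attached: extend it.
            out[-1] = Node(prev.label + node.label, node.regions)
        else:
            out.append(node)
    return out


def to_latex(baseline):
    """Translate a (lexed) baseline into a LaTeX string."""
    parts = []
    for node in baseline:
        text = node.label
        if "sub" in node.regions:
            text += "_{" + to_latex(node.regions["sub"]) + "}"
        if "super" in node.regions:
            text += "^{" + to_latex(node.regions["super"]) + "}"
        parts.append(text)
    return " ".join(parts)


# Hand-built tree for: x squared, plus, then the symbols 3 . 1 4
expr = [Node("x", {"super": [Node("2")]}), Node("+"),
        Node("3"), Node("."), Node("1"), Node("4")]
print(to_latex(lex_baseline(expr)))  # x^{2} + 3.14
```

Keeping the lexical grouping and the LaTeX translation as separate functions over the same tree type mirrors the compiler-like separation of passes that the abstract credits for robustness and scalability. The real system's data structures and rules are considerably richer than this sketch.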

Related resources

A Structural Analysis Approach for Online Handwritten Mathematical Expressions

This paper proposes a structural analysis approach for mathematical expressions based on the Attribute String Grammar and the Baseline Tree Transformation approaches. The approach consists of geometrical feature extraction, parsing structure and expression analysis steps. The algorithm for structure parsing uses baselines, which are represented by geometrical features to recursively decompose t...

Layout-based substitution tree indexing and retrieval for mathematical expressions

We introduce a new system for layout-based (LaTeX) indexing and retrieving mathematical expressions using substitution trees. Substitution trees can efficiently store and find expressions based on the similarity of their symbols, symbol layout, sub-expressions and size. We describe our novel design and some of our contributions to the substitution tree indexing and retrieval algorithms. We provi...

Performance Metrics and Their Extraction Methods for Audio Rendered Mathematics

We introduce and compare three approaches to calculate structure- and content-based performance metrics for user-based evaluation of math audio rendering systems: Syntax Tree alignment, Baseline Structure Tree alignment, and MathML Tree Edit Distance. While the first two require “manual” tree transformation and alignment of the mathematical expressions, the third obtains the metrics without human...

VMEXT: A Visualization Tool for Mathematical Expression Trees

Mathematical expressions can be represented as a tree consisting of terminal symbols, such as identifiers or numbers (leaf nodes), and functions or operators (non-leaf nodes). Expression trees are an important mechanism for storing and processing mathematical expressions as well as the most frequently used visualization of the structure of mathematical expressions. Typically, researchers and pr...

A Bayesian model for recognizing handwritten mathematical expressions

Recognizing handwritten mathematics is a challenging classification problem, requiring simultaneous identification of all the symbols comprising an input as well as the complex two-dimensional relationships between symbols and subexpressions. Because of the ambiguity present in handwritten input, it is often unrealistic to hope for consistently perfect recognition accuracy. We present a system ...

Journal:
  • IEEE Trans. Pattern Anal. Mach. Intell.

Volume 24

Publication date: 2002